PROCESSOR HAVING MULTIPLE INSTRUCTION REGISTERS



The present invention relates to a processor that may
be a microprocessing unit (MPU) with an internal or
external program memory, a digital signal processor (DSP)
with an internal or external program memory, or the like.

These types of processors perform pipeline processing
in order to speed up processing. In pipeline processing in
the prior art, an instruction queue comprising, for example,
6 stages of registers is connected to the front stage of a
decoder and a queue with the same number of stages is
connected to the rear stage of this decoder. Since one
normal instruction can be executed in one cycle once the
pipeline has settled into the stationary state, high-speed
processing is possible.

However, with instructions that require processing
different from that of normal instructions, such as branch
instructions, immediate data transfer instructions or
variable length instructions, the processing speed is
reduced as described below.

(1) In the case of a branch instruction, since it
changes the execution sequence of the instructions,
instructions that have been partially processed are
discarded and it is necessary to start anew from the
instruction fetch, cancelling out the benefits of the
pipeline processing.

Therefore, branch prediction may be performed for the
branch instruction by connecting the instruction at the
branch destination in front of the branch instruction and
reading it into the pipeline. However, this complicates the
structure of the compiler, which performs the branch
prediction. Also, under certain conditions the branching
will not occur, and since the instruction at the branch
destination will still be executed even though it is not
necessary, the processing speed is reduced.

Another approach eliminates dead time by inserting the
instruction to be executed before a conditional branch
instruction behind the conditional branch instruction as
a delay slot and by executing this delay slot while the
branch destination is being determined. However, this
method too makes the compiler that inserts the delay slot
more complicated and also, if a delay slot cannot be
inserted, the processing speed is reduced.

(2) In the case of an immediate data transfer
instruction, time is required for the calculation of the
execution address and for memory access. This problem can
be overcome and processing can be sped up by using an
immediate data transfer instruction which places the data
inside an instruction word. However, since an immediate data
transfer instruction must wait for the intake of the
immediate data, the execution needs a plurality of cycles,
thus reducing the processing speed.

(3) In the case of multiple length instructions, it is
necessary to perform decoding again after the multiple
lengths are compounded; thus the execution needs a plurality
of cycles, reducing the processing speed.



Accordingly, an object of the present invention is to 
provide a processor which can speed up the processing 
without complicating the compiler. 

In accordance with the invention, there is provided
a processor comprising, for each i that is 1 to n: an i-th
program counter; i-th memory means for being addressed with
the output from the i-th program counter; and an i-th
instruction register for holding the outputs from the i-th
memory means; the processor further comprising: an
instruction decoder for selecting one of the outputs from the
1-st to n-th instruction registers and for decoding the
selected output; an execution circuit for executing
processing based upon the output from the instruction
decoder; and a control circuit for inducing the instruction
decoder to select the outputs from the 1-st to n-th
instruction registers sequentially, for inducing the i-th
program counter to update after the output of the i-th
instruction register is selected by the instruction decoder,
and for inducing the i-th instruction register to hold the
outputs of the i-th memory means after the update; wherein a
program is stored in the 1-st to n-th memory means in units
of one word in the order of the 1-st memory means to the
n-th.

With the invention, since the branch instruction and the
instruction at the destination of the branch are executed
continuously without interruption and without the compiler
performing any special processing for the branch
instruction, it is possible to speed up the processing
compared to the prior art without complicating the
compiler.

In accordance with a preferred first mode of the present
invention, there is provided the processor according to the
invention wherein the outputs of the 1-st to n-th
instruction registers are supplied to input terminals of
the execution circuit via corresponding 1-st to n-th
bypasses; and the control circuit decides whether or not
the instruction decoded by the i-th instruction decoder is
an immediate data transfer instruction based upon the output
of that decoder and, if it is determined to be an immediate
data transfer instruction, induces the execution circuit to
fetch the immediate data through the i-th bypass in order
to execute the immediate data transfer instruction at
once.

With the first mode, an immediate data transfer
instruction is executed in one cycle without
interruption, achieving faster processing in comparison
with the prior art, which requires a plurality of
cycles.

In accordance with a preferred second mode of the
present invention, the processor according to the
invention further comprises: a multiple length
instruction decoder for instructions of length N, where
2 ≤ N ≤ n, for decoding N successive words in the outputs
of the 1-st to n-th instruction registers as a multiple
length instruction and for supplying the decoding
result to the execution circuit; wherein the control
circuit, when the output of one of the 1-st to n-th
instruction decoders indicates a multiple length
instruction of length N, induces the multiple length
instruction decoder to decode the multiple length
instruction of length N, and induces the said one of
the instruction decoders to decode the instruction
following after the multiple length instruction of length N.

With the second mode, a multiple length instruction of length N
is executed in one cycle without interruption, achieving
faster processing in comparison with the prior art, which
requires a plurality of cycles.



FIG. 1 is a block diagram showing a processor in the 
first embodiment according to the present invention; 

FIG. 2 is a timing chart that shows normal pipeline
processing performed after a reset and up to the time when
the stationary state is achieved in the device in FIG. 1;

FIG. 3 is a timing chart that shows pipeline
processing performed for an unconditional branch instruction
in the device in FIG. 1;

FIG. 4 is a timing chart that shows pipeline
processing performed for a conditional branch instruction in
the device in FIG. 1;

FIG. 5 is a block diagram showing a processor in the 
second embodiment according to the present invention; 

FIG. 6 is a timing chart that shows pipeline 
processing performed for an immediate data transfer 
instruction in the device in FIG. 5; 

FIG. 7 is a block diagram showing a processor in the 
third embodiment according to the present invention; 

FIG. 8 is a timing chart that shows pipeline
processing performed for a double length instruction in the
device in FIG. 7;

FIG. 9 is a block diagram showing a processor in the
fourth embodiment according to the present invention;

FIG. 10 is a timing chart that shows normal pipeline
processing performed after a reset up to the time when the
stationary state is achieved in the device in FIG. 18;

FIG. 11 is a block diagram showing a processor in the 
fifth embodiment according to the present invention; 

FIG. 12 is a timing chart that shows pipeline
processing performed for an immediate data transfer
instruction in the device in FIG. 20; and

FIG. 13 is a block diagram showing a processor in the
sixth embodiment according to the present invention.



Referring now to the drawings, wherein like reference
characters designate like or corresponding parts throughout
the several views, embodiments of the present invention are
described below.

First embodiment 

FIG. 1 shows a processor in the first embodiment
according to the present invention.

The memory 1A and the memory 1B have an identical
structure with equal storage capacity, and one program is
stored in the two memories 1A and 1B by dividing the program
into one-word units which are written alternately into each
of the two memories. Namely, with the addresses in the
memory 1A designated as A0, A1, A2, ... and with the
addresses in the memory 1B designated as B0, B1, B2, ...,
the program is stored in the order: A0, B0, A1, B1, A2, B2,
....
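
The alternating storage described above can be illustrated
with a minimal Python sketch. The function name interleave,
the parameter n_banks and the word labels w0 to w5 are
hypothetical and are used only for illustration; the sketch
models the storage order, not the hardware.

```python
def interleave(program, n_banks=2):
    """Distribute program words over n_banks memories, one word at a time.

    Word j of the program goes to bank j % n_banks at address j // n_banks,
    which for two banks gives the order A0, B0, A1, B1, ...
    """
    banks = [[] for _ in range(n_banks)]
    for j, word in enumerate(program):
        banks[j % n_banks].append(word)
    return banks


# Hypothetical six-word program, used only as an example.
program = ["w0", "w1", "w2", "w3", "w4", "w5"]
memory_1a, memory_1b = interleave(program)
print(memory_1a)  # ['w0', 'w2', 'w4']
print(memory_1b)  # ['w1', 'w3', 'w5']
```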

The address input terminal of the memory 1A is
connected to the output terminal of the program counter 2A
via exclusive wires and the address input terminal of the
memory 1B is connected to the output terminal of the program
counter 2B via exclusive wires. Each of the program
counters 2A and 2B has a two-stage structure consisting of
the normal counter unit, which is the input stage, and the
holding (register) unit, which is the output stage, and, as
explained later, the contents PAN (or PBN), which have been
updated and confirmed at the input stage, are held at the
output stage as PA (or PB) in response to a control signal.
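
The two-stage structure of the program counter can be
modelled roughly as follows. This is only a behavioural
sketch in Python; the class name TwoStageProgramCounter and
its method names are assumptions made for illustration and
do not appear in the specification.

```python
class TwoStageProgramCounter:
    """Behavioural model of a program counter with input and output stages."""

    def __init__(self, start=0):
        self.input_stage = start   # corresponds to PAN (or PBN)
        self.output_stage = start  # corresponds to PA (or PB)

    def pulse(self):
        """One clock pulse adds one to the input-stage counter value."""
        self.input_stage += 1

    def load(self, address):
        """Load an address (e.g. a branch destination) into the input stage."""
        self.input_stage = address

    def latch(self):
        """Hold the input-stage contents at the output stage."""
        self.output_stage = self.input_stage


pc_2a = TwoStageProgramCounter(start=10)
pc_2a.pulse()                 # input stage is updated, output stage is not
before = pc_2a.output_stage   # still 10
pc_2a.latch()                 # control signal transfers the value
after = pc_2a.output_stage    # now 11
```

The memory is addressed only by the output stage, so an
update of the input stage does not disturb a fetch that is
in progress until the value is latched.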

The data output terminal of the memory 1A is connected
to the input terminal of the instruction register 3A via
exclusive wires and the data output terminal of the memory
1B is connected to the input terminal of the instruction
register 3B via exclusive wires. The memory 1A constantly
supplies the contents IA1 at the address PA specified by the
program counter 2A to the input terminal of the instruction
register 3A and the memory 1B constantly supplies the
contents IB1 at the address PB specified by the program
counter 2B to the input terminal of the instruction register
3B. The instruction registers 3A and 3B hold IA1 and IB1
respectively, to output them as IA2 and IB2 respectively, in
response to a strobe signal.

The output terminals of the instruction registers 3A
and 3B are connected to the input terminals of the
instruction decoders 4A and 4B respectively. Each of the
instruction decoders 4A and 4B is provided with an internal
register at the input stage and decodes the instruction code
held in this register for output as DA or DB.

The output terminals of the instruction decoders 4A
and 4B are connected to the input terminals of the execution
circuit 5 and the control circuit 6. The execution circuit
5 is provided with a selector at the internal input stage,
which selects either DA or DB in response to the selection
control signal, and a register that holds the selected DA or
DB. Based upon the decoding result, which is the output of
the decoder held in this register, the execution circuit 5
executes processing such as calculation or data transfer in
the same manner as an execution circuit in the known art,
which is provided with an ALU and internal registers. To
simplify the expression, we proceed on the premise that
selecting either DA or DB means selecting either DA or DB
and holding it in the register.

These components 1A to 4A, 1B to 4B and 5 operate in
accordance with control signals sent from the control
circuit 6. These control signals are generated in
synchronism with the double phase clocks CA and CB as shown
in FIG. 2. The basics of the control performed by the
control circuit 6 are as follows:

(1) The control circuit 6 induces the execution
circuit 5 to select and execute DA and DB alternately. It
supplies the selection control signal to the execution
circuit 5 with the timing of the rise of the clock CA,
inducing it to select DB when it has completed the execution
of the output of the instruction decoder 4A and to select
the decoding result DA when it has completed the execution
of the decoding result of the instruction decoder 4B. Then
it induces the internal register to hold the selected
decoding result. The initial selection after a reset is DA.

(2) When DA is selected, the control circuit 6 induces
the instruction decoder 4A to hold IA2 and when DB is
selected, it induces the instruction decoder 4B to hold IB2.

(3) When IA2 is held in the instruction decoder 4A,
the control circuit 6 induces the instruction register 3A to
hold IA1 and when IB2 is held in the instruction decoder 4B,
it induces the instruction register 3B to hold IB1.

(4) The control circuit 6 induces the contents PAN or
PBN at the input stage of the program counter 2A or 2B to be
held at the output stage as PA or PB with the timing of the
rise of the clock CA.

(5) The control circuit 6 induces IA1 to be held in
the instruction register 3A or IB1 to be held in the
instruction register 3B with the timing of the rise of the
clock CB.

(6) The control circuit 6 updates the contents PAN or
PBN at the input stage of the program counter 2A or 2B with
the timing of the rise of the clock CB. Updating of PAN is
usually performed by supplying one pulse to the clock input
terminal of the program counter 2A to add one to the counter
value. When the output of the instruction decoder 4A
indicates an unconditional branch instruction, however, it
is performed by determining the address of the branch
destination based upon the output of the instruction decoder
4A and then loading it to the program counter 2A, and when
the output of the instruction decoder 4A indicates a
conditional branch instruction, it is performed by
determining the address of the branch destination based upon
the output of the instruction decoder 4A and a status flag
and then loading it to the program counter 2A. The updating
of the program counter 2B is executed in the same manner as
that of the program counter 2A.

(A) Next, the normal pipeline processing that is
performed after the processor is reset until the processor
enters the stationary state is explained in reference to FIG.
2.

The starting address for program execution after a
reset is designated as n. Although not shown in FIG. 1 or
FIG. 2, when the processor is reset, initializing processing
is performed in which n is loaded to the input stages of the
program counters 2A and 2B and, on the other side, the
execution starting address n is supplied to the input
terminals of the memories 1A and 1B through another route
(not shown).

The rise points in time of the clock CA are assigned
odd numbers: t1, t3, t5, ... and the rise points in time of
the clock CB are assigned even numbers: t2, t4, t6, .... In
addition, IA1, which is read out from the address i in the
memory 1A, is indicated as IA1(i), and IA2 when IA1(i) is
held in the instruction register 3A is indicated as IA2(i).
The same rule applies to IB1 and IB2. In the following
explanation, the pipeline has 5 stages, consisting of the
instruction fetch (IF) stage, the instruction decode (ID)
stage, the execution (EX) stage, the memory access (MA)
stage and the stage of writing to register (WB). For
example, in the case of an instruction with which data are
read out from the memory address that is the content of the
index register IX with 100 added, to load to the register R,
i.e.,

LOAD R, IX + 100

the EX stage is the processing in which 100 is added to the
contents of the index register IX to determine the execution
address, the MA stage is the processing in which data are
read out from this execution address in the memory and the
WB stage is the processing in which the data that have been
read out are stored in the register R. In the case of an
instruction for a register-register operation such as a
register-register compare instruction, the EX stage is the
operation between two registers, the MA stage is meaningless
and the WB stage is the storing of the result of the
operation in a register. In the case of an immediate data
transfer instruction, i.e.,

LDI R, 200

the WB stage is the processing in which the immediate data
200 are stored in the register R, and the EX and MA stages
are meaningless.

Of the control signals output from the control circuit
6, those shown in FIG. 2 are as follows: the instruction IA
decode signal is for inducing the instruction decoder 4A,
holding IA2, to decode it when the signal is high, and the
instruction IB decode signal is for inducing the instruction
decoder 4B, holding IB2, to decode it when the signal is
high. The instruction IA execution signal is for inducing
the execution circuit 5 to select and execute DA when it is
high and the instruction IB execution signal is for inducing
the execution circuit 5 to select and execute DB when it is
high. The IF stage starts from the updating of PA or PB of
the program counter 2A or 2B.

(t1) n is loaded to both the program counters 2A and
2B. PA = n and PB = n. The instruction decoders 4A, 4B and
the execution circuit 5 are in the wait state.

(t2) IA1(n) and IB1(n) are held in the instruction
registers 3A and 3B respectively. One pulse is supplied to
the clock input terminals of the program counters 2A and 2B
so that PAN = n+1 and PBN = n+1. The instruction decoders
4A, 4B and the execution circuit 5 are in the wait state.

(t3) IA2(n) and IB2(n) are held in the instruction
decoders 4A and 4B respectively and decoded. PA = n+1 and
PB = n+1. The execution circuit 5 is in the wait state.

(t4) IA1(n+1) and IB1(n+1) are held in the
instruction registers 3A and 3B respectively. One pulse is
supplied to the clock input terminals of the program
counters 2A and 2B so that PAN = n+2 and PBN = n+2. The
execution circuit 5 is in the wait state.

(t5) DA(n) is selected by the execution circuit 5 and
executed to perform the EX stage. On the other side, DB(n) is
not selected and the instruction decoder 4B enters the wait
state. This wait state means that the output of the
instruction decoder 4B is not used, although the decoder
keeps outputting the decoding result.
IA2(n+1) is held in the instruction decoder 4A and decoded.
PA = n+2.

(t6) IA1(n+2) is held in the instruction register 3A.
One pulse is supplied to the clock input terminal of the
program counter 2A so that PAN = n+3.

(t7) DB(n) is selected by the execution circuit 5 and
executed. DA(n+1) is not selected and the instruction
decoder 4A enters the wait state. The memory access which
corresponds to DA(n) is executed by the control circuit 6.
IB2(n+1) is held in the instruction decoder 4B and decoded.
PB = n+2.

(t8) IB1(n+2) is held in the instruction register 3B.
One pulse is supplied to the clock input terminal of the
program counter 2B so that PBN = n+3.

(t9) DA(n+1) is selected by the execution circuit 5
and executed. DB(n+1) is not selected and the instruction
decoder 4B enters the wait state. The write to register
that corresponds to DA(n) and the memory access which
corresponds to DB(n) are executed by the control circuit 6.
IA2(n+2) is held in the instruction decoder 4A and decoded.
PA = n+3.

The five-stage normal pipeline processing is executed
in the manner described above and the operation enters the
stationary state.
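
The alternating selection at t5, t7 and t9 described above
can be sketched in Python as follows. The function name
ex_stage_schedule is hypothetical, and the sketch reproduces
only the order in which decoding results enter the EX stage
after a reset, assuming that the pattern stated for t5 to t9
simply continues in the stationary state.

```python
def ex_stage_schedule(n, count):
    """List which decoding result enters the EX stage at each odd time point.

    After a reset with start address n, DA(n) is selected at t5, DB(n) at t7,
    DA(n+1) at t9, and so on, alternating between the two decoders.
    """
    schedule = []
    for j in range(count):
        decoder_output = "DA" if j % 2 == 0 else "DB"
        time_point = f"t{5 + 2 * j}"
        schedule.append((time_point, f"{decoder_output}({n + j // 2})"))
    return schedule


# Start address 0 is an arbitrary example value.
print(ex_stage_schedule(0, 3))
# [('t5', 'DA(0)'), ('t7', 'DB(0)'), ('t9', 'DA(1)')]
```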

(B) Now, the pipeline processing for an unconditional
branch instruction that is executed after the processor
enters the stationary state is explained in reference to FIG.
3.

We proceed on the premise that this unconditional
branch instruction is stored at address An (address n in the
memory 1A), that a relative branch destination address K is
in the word of the unconditional branch instruction and that
the relative branch destination address K is an even number
2k and is in memory 1A.

The actual branch destination for the relative branch
destination address K is determined in the same manner as
with a processor in the prior art, in which the memory 1A
and the memory 1B are integrated, by regarding address i in
the memory 1A as 2i and also regarding the address i in the
memory 1B as 2i + 1 for each i that is 0 to M. To put it
more concretely, when the unconditional branch instruction
is stored in the memory 1A (or the memory 1B) and K is an
odd number 2k + 1, i.e., when the lowest-order bit of K is
"1," the branch destination relative address is equal to the
branch destination relative address k in the memory 1B (or
the memory 1A), and when K is an even number 2k, i.e., when
the lowest-order bit of K is "0," the branch destination
relative address is equal to the branch destination relative
address k in the memory 1A (or the memory 1B).
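
The correspondence between an address in the integrated
memory and an address in the memory 1A or 1B can be sketched
in Python as follows. The function names split_address and
join_address are hypothetical; the sketch covers only the
mapping of address i in the memory 1A to 2i and of address i
in the memory 1B to 2i + 1, not the relative branch
calculation itself.

```python
def split_address(linear):
    """Map an integrated-memory address to (memory, address in that memory).

    The lowest-order bit selects the memory and the remaining bits give
    the address inside it.
    """
    memory = "1B" if linear & 1 else "1A"
    return memory, linear >> 1


def join_address(memory, i):
    """Inverse mapping: address i in 1A is 2i, address i in 1B is 2i + 1."""
    return 2 * i + (1 if memory == "1B" else 0)


print(split_address(6))       # ('1A', 3)
print(split_address(7))       # ('1B', 3)
print(join_address("1B", 3))  # 7
```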






The following explanation includes part of the
pipeline processing of the instruction words preceding the
address n-1 in the memory 1A and preceding the address n-1
in the memory 1B.

(t1) DA(n-2) is selected by the execution circuit 5
and executed. On the other side, DB(n-2) is not selected and
the instruction decoder 4B enters the wait state. IA2(n-1)
is held in the instruction decoder 4A and decoded. PA = n.

(t2) The unconditional branch instruction IA1(n) is
held in the instruction register 3A. One pulse is supplied
to the clock input terminal of the program counter 2A so
that PAN = n+1.

(t3) DB(n-2) is selected by the execution circuit 5
and executed. On the other side, DA(n-1) is not selected
and the instruction decoder 4A enters the wait state.
IB2(n-1) is held in the instruction decoder 4B and decoded.
PB = n.

(t4) IB1(n) is held in the instruction register 3B.
One pulse is supplied to the clock input terminal of the
program counter 2B so that PBN = n+1.

(t5) DA(n-1) is selected by the execution circuit 5
and executed. On the other side, DB(n-1) is not selected
and the instruction decoder 4B enters the wait state. IA2(n)
is held in the instruction decoder 4A and decoded. PA = n+1.

(t6) IA1(n+1) is held in the instruction register 3A.
It is determined by the control circuit 6 that DA(n)
indicates an unconditional branch instruction and the k of
the relative branch destination address K = 2k which is
contained in DA(n) is loaded to the program counters 2A and
2B so that PAN = k and PBN = k. With this, the subsequent
operation becomes identical to the operation after a reset.

(t7) DB(n-1) is selected by the execution circuit 5
and executed. On the other side, DA(n) is not selected and
the instruction decoder 4A enters the wait state. IB2(n) is
not held in the instruction decoder 4B (is cancelled). PA =
k and PB = k.

(t8) IA1(k) and IB1(k) are respectively held in the
instruction registers 3A and 3B. PAN = k+1 and PBN = k+1.

(t9) The decoding result DA(n) of the unconditional
branch instruction is selected by the execution circuit 5
and executed without meaning. This is because k has been
loaded into the program counters 2A and 2B at the time t6,
thus the execution of the unconditional branch has already
been performed. IA2(k) and IB2(k) are held in the instruction
decoders 4A and 4B respectively and decoded. PA = k+1 and
PB = k+1.

(t10) IA1(k+1) and IB1(k+1) are held in the
instruction registers 3A and 3B respectively. PAN = k+2 and
PBN = k+2.

(t11) DA(k) is selected by the execution circuit 5 and
executed. On the other side, DB(k) is not selected and the
instruction decoder 4B enters the wait state. The memory
access which corresponds to DA(n) is executed without
meaning by the control circuit 6 (the MA stage corresponding
to DA(n) is meaningless). IA2(k+1) is held in the
instruction decoder 4A and decoded. PA = k+2.

As has been explained, since the unconditional branch
instruction and the instruction at the branch destination
are executed continuously without any interruption, a delay
in processing is prevented. Although FIG. 3 illustrates the
case in which the branch destination is Ak, a delay in
processing is prevented in a similar manner when the branch
destination is Bk.

(C) Next, the pipeline processing for a conditional
branch instruction that is executed after the processor
enters the stationary state is explained in reference to FIG.
4.

The following discussion is based on the premise that
a register-register compare instruction is stored at address
An, that the conditional branch instruction is stored at
address Bn and that the branch destination address is
determined to be either An+1 or Bk depending upon the
result of the execution of this compare instruction in the EX
stage.

In FIG. 4, the operation from the time point t1
through the time point t7 is obvious from the above
explanation in reference to FIGS. 2 and 3, and it is,
therefore, omitted here. The time points t3 to t7
correspond to the time points t1 to t5 in FIG. 3.

(t8) It is determined by the control circuit 6 that
DB(n) is a conditional branch instruction. With this
decision, the k of the relative branch destination address K
= 2k, which is contained in DB(n), is loaded to the program
counter 2B so that PBN = k. Then k+1 is loaded to the
program counter 2A as the address following Bk so that
PAN = k+1. With this, the subsequent operation becomes
similar to that performed after a reset.

(t9) The decoding result DA(n) of the compare
instruction is selected by the execution circuit 5 and
executed. On the other side, DB(n) is not selected and the
instruction decoder 4B enters the wait state. IA2(n+1) is
held in the instruction decoder 4A and decoded. PA = k+1
and PB = k.

(t10) IA1(k+1) and IB1(k) are held in the
instruction registers 3A and 3B respectively. One pulse is
supplied to the clock input terminals of the program
counters 2A and 2B so that PAN = k+2 and PBN = k+1.

(t11) With the result of the execution of the compare
instruction described above, the branch destination of the
conditional branch instruction is determined. In FIG. 4, the
branch destination address is determined to be Bk and
DA(n+1) is cancelled. The decoding result DB(n) of the
conditional branch instruction is selected by the execution
circuit 5 and executed formally. The memory access that
corresponds to DA(n) is executed without meaning by the
control circuit 6 (the MA stage corresponding to DA(n) is
meaningless). IA2(k+1) and IB2(k) are held in the
instruction decoders 4A and 4B respectively and decoded.
PB = k+1.

(t12) IB1(k+1) is held in the instruction register 3B.
PBN = k+2.

(t13) The decoding result DB(k) of the instruction at
the branch destination is selected by the execution circuit
5 and executed. On the other side, DA(k+1) is not selected
and the instruction decoder 4A enters the wait state.
The write to register that corresponds to DA(n) and the
memory access that corresponds to DB(n) are executed without
meaning by the control circuit 6 (the WB stage corresponding
to DA(n) and the MA stage corresponding to DB(n) are
meaningless). IB2(k+1) is held in the instruction decoder
4B and decoded. PB = k+2.

As has been explained, since the conditional branch
instruction and the instruction at the branch destination
are executed continuously without any interruption, a delay
in processing is prevented. Although FIG. 4 illustrates a
case in which the branch destination is Bk, a delay in
processing is prevented in a similar manner when the branch
destination is Ak.
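
The values loaded to the two program counters in the branch
examples above can be summarised by a small Python sketch.
The function name counters_after_branch is hypothetical, and
the sketch covers only the two cases stated in the text: a
destination Ak (PAN = k and PBN = k, as in FIG. 3) and a
destination Bk (PBN = k and PAN = k+1, as in FIG. 4).

```python
def counters_after_branch(memory, k):
    """Return (PAN, PBN) loaded when branching to address k of the given memory.

    For a destination Ak both counters receive k. For a destination Bk the
    instruction that follows Bk is at address k+1 in memory 1A, so the
    program counter 2A receives k+1 while the program counter 2B receives k.
    """
    if memory == "A":
        return k, k
    if memory == "B":
        return k + 1, k
    raise ValueError("memory must be 'A' or 'B'")


print(counters_after_branch("A", 8))  # (8, 8)
print(counters_after_branch("B", 8))  # (9, 8)
```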

Second embodiment 

FIG. 5 shows a processor in the second embodiment
according to the present invention.

In this processor, in order to supply the immediate
data at the second word in an immediate data transfer
instruction directly to the execution circuit 5A from the
instruction registers 3A and 3B, the output terminals of the
instruction registers 3A and 3B are directly connected to the
input terminals of the execution circuit 5A via the bypasses
7A and 7B respectively.

When the control circuit 6A determines that the output
of the instruction decoder 4A or 4B indicates an immediate
data transfer instruction, it induces the execution circuit
5A to directly hold IA2 or IB2 via the bypass 7A or 7B and
to execute it in one cycle.

All other aspects are identical to those of the
processor shown in FIG. 1.

FIG. 6 shows the pipeline processing for the immediate
data transfer instruction after the processor has entered
the stationary state.

We proceed on the premise that the first word of the
immediate data transfer instruction is stored at the address
An and that the immediate data, i.e., the second word of the
immediate data transfer instruction, are stored at the
address Bn.

(t1) DA(n-2) is selected by the execution circuit 5A
and executed. On the other side, DB(n-2) is not selected
and the instruction decoder 4B enters the wait state.
IA2(n-1) is held in the instruction decoder 4A and decoded.
PA = n.

(t2) The first word IA1(n) of the immediate data
transfer instruction is held in the instruction register 3A.
One pulse is supplied to the clock input terminal of the
program counter 2A so that PAN = n+1.

(t3) DB(n-2) is selected by the execution circuit 5A
and executed. On the other side, DA(n-1) is not selected
and the instruction decoder 4A enters the wait state.
IB2(n-1) is held in the instruction decoder 4B and decoded.
PB = n.

(t4) The immediate data IB1(n) is held in the
instruction register 3B. One pulse is supplied to the clock
input terminal of the program counter 2B so that PBN = n+1.

(t5) DA(n-1) is selected by the execution circuit 5A
and executed. On the other side, DB(n-1) is not selected
and the instruction decoder 4B enters the wait state.
IA2(n) is held in the instruction decoder 4A and decoded.
PA = n+1.

(t6) IA1(n+1) is held in the instruction register 3A.
PAN = n+2. The control circuit 6A determines that DA(n)
indicates an immediate data transfer instruction and,
therefore, that IB2(n) is immediate data.

(t7) Based upon the above determination, IB2(n) is not
held in the instruction decoder 4B but is held in the
internal register of the execution circuit 5A via the bypass
7B. DB(n-1) is selected by the execution circuit 5A and
executed. DA(n) is not selected and the instruction decoder
4A enters the wait state. PB = n+1.

(t8) IB1(n+1) is held in the instruction register 3B.
PBN = n+2.

(t9) The decoding result DA(n) of the immediate data 
transfer instruction is selected and executed without 
meaning by the execution circuit 5A (the EX stage 
corresponding to DA(n) is meaningless). IA2(n+1) and 
IB2(n+1) are held in the instruction decoders 4A and 4B 
respectively and are decoded. PA = n+2 and PB = n+2. 

(t10) IA1(n+2) and IB1(n+2) are held in the 
instruction registers 3A and 3B respectively. PAN = n+3. 

(t11) DA(n+1) is selected by the execution circuit 5A 
and executed. On the other side, DB(n+1) is not selected 
and the instruction decoder 4B enters the wait state. 
The memory access that corresponds to DA(n) is executed 
without meaning by the control circuit 6A (the MA stage 
corresponding to DA(n) is meaningless). IA2(n+2) is held 
in the instruction decoder 4A and decoded. 

The transfer of the immediate data to a register 
is executed in the WB stage corresponding to DA(n). 

As has been explained so far, the double length 
immediate data transfer instruction which includes immediate 
data is executed without interruption in one cycle, 
preventing any delay in processing. 
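
The routing decision made at the time points t6 and t7 can be 
illustrated with a minimal behavioural sketch in Python. It 
models only the order in which words are consumed, not the 
circuit or its timing. The opcode names MOVI and ADD, the 
tuple encoding of the words and the function names interleave 
and run are assumptions introduced here for illustration and 
do not appear in the figures. 

```python
def interleave(bank_a, bank_b):
    """Yield words in program order: A(0), B(0), A(1), B(1), ..."""
    for word_a, word_b in zip(bank_a, bank_b):
        yield word_a
        yield word_b


def run(bank_a, bank_b):
    """Execute the interleaved program and record how each word was routed."""
    regs = {}
    trace = []
    words = interleave(bank_a, bank_b)
    for word in words:
        trace.append(("decode", word))
        if word[0] == "MOVI":
            # The next word comes from the other bank and is immediate data:
            # it goes straight to the execution step, not through a decoder.
            immediate = next(words)
            trace.append(("bypass", immediate))
            regs[word[1]] = immediate
        else:  # assumed ("ADD", register, value) form
            _, reg, value = word
            regs[reg] = regs.get(reg, 0) + value
    return regs, trace
```

With ("MOVI", "r0") stored in bank A and the value 5 stored 
at the same address in bank B, the sketch records the value 5 
as a bypass entry and never as a decode entry, which mirrors 
the handling of IB2(n) described above. 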

Third embodiment 

FIG. 7 shows a processor in the third embodiment 
according to the present invention. 

In order to execute a double length instruction faster, 
this processor is provided with a double length instruction 
decoder 4N in addition to the instruction decoders 4A and 4B. 
The output terminals of the instruction registers 3A and 3B 
are connected to the input terminals of the double length 
instruction decoder 4N, and the output terminal of the 
double length instruction decoder 4N is connected to the 
input terminal of the execution circuit 5B. The double 
length instruction decoder 4N is provided with a register 
that holds IA2 and IB2 at its internal input stage and 
decodes double length instructions held in this register. 

When the control circuit 6B decides that the output of 
the instruction decoder 4A or 4B indicates a double length 
instruction, it induces the double length instruction 
decoder 4N to hold IA2 and IB2 and to decode the 
double length instruction. The decoding of the double 
length instruction is executed in one cycle. 

All other aspects are identical to those of the 
processor shown in FIG. 1. 
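
The choice between the normal decoders and the double length 
instruction decoder 4N can be sketched as follows. This is an 
illustrative model under stated assumptions: the dictionary 
word format, the flag is_double_first (standing for "the 
decoder output indicates a double length instruction") and 
the function name choose_decode_path are hypothetical and are 
not taken from the figures. 

```python
def choose_decode_path(ia2, ib2, first_bank="A"):
    """Return which decoder handles the word(s) now at the decoder inputs.

    ia2, ib2:   words delivered from instruction registers 3A and 3B
                (assumed dict format with a 'bits' field and an optional
                'is_double_first' flag).
    first_bank: bank whose word comes first in program order.
    """
    first, second = (ia2, ib2) if first_bank == "A" else (ib2, ia2)
    if first.get("is_double_first"):
        # Both words are latched together by the double length decoder 4N
        # and decoded as one instruction.
        return ("4N", (first["bits"], second["bits"]))
    # Normal instruction: the decoder of the first bank handles one word.
    return ("4" + first_bank, (first["bits"],))
```

The sketch returns the pair of words only when the first word 
is flagged as the start of a double length instruction; in 
every other case a single word is passed to the decoder 4A or 
4B as in the first embodiment. 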

FIG. 8 shows the pipeline processing for a double 
length instruction when the processor has entered the 
stationary state. 

In FIG. 8, the double length decode signal is the signal 
with which the control circuit 6B induces the double length 
instruction decoder 4N to hold IA2 and IB2 and decode them 
while the signal is high. 

We proceed on the premise that one double length 
instruction is stored at the addresses An and Bn. Since the 
operation at the time points t1 to t5 is identical to that 
at the time points t1 to t5 in FIG. 3, its explanation is 
omitted here. 

(t6) IA1(n+1) is held in the instruction register 3A. 
PAN = n+2. The control circuit 6B determines that DA(n) 
indicates the first word of the double length instruction. 

(t7) DB(n-1) is selected by the execution circuit 5B 
and executed. Based upon the decision described above, 
IA2(n) and IB2(n) are held at the input stage of the double 
length instruction decoder 4N and decoded. On the other 
side, after this holding, IA2(n+1) is held in the instruction 
decoder 4A and decoded. PA = n+2 and PB = n+1. 

(t8) IA1(n+2) is held in the instruction register 3A 
and IB1(n+1) is held in the instruction register 3B. 
PAN = n+3 and PBN = n+2. 

(t9) The decoding result of the double length 
instruction decoder 4N is selected by the execution circuit 
5B and executed. IA2(n+1) is held in the instruction 
decoder 4A and is decoded. PB = n+2. 

(t10) IB1(n+2) is held in the instruction register 3B. 
PBN = n+3. 

(t11) DA(n+1) is selected by the execution circuit 5B 
and executed. On the other side, DB(n+1) is not selected 
and the instruction decoder 4B enters the wait state. 
The memory accesses that correspond to DA(n) and DB(n) are 
executed by the control circuit 6B. IA2(n+2) is held in 
the instruction decoder 4A and decoded. PA = n+3. 

As has been explained so far, the double length 
instruction is executed without interruption in one cycle, 
thus preventing a delay in processing. 

Fourth embodiment 

FIG. 9 shows the processor in the fourth embodiment 
according to the present invention. 

In this processor, in order to simplify the structure 
compared to that shown in FIG. 1, only one instruction 
decoder 4 is used. Since there is only one instruction 
decoder 4, the input stage inside the execution circuit 
is not provided with a selector and instead a selector which 
selects either the output of the instruction register 3A or 
3B is provided at the next stage of the register located at 
the input stage within the instruction decoder 4. The only 
difference between the execution circuit 5C and the 
execution circuit 5 shown in FIG. 1 is that the execution 
circuit 5C is not provided with a selector at the internal 
input stage. The only difference between the instruction 
decoder 4 and the instruction decoder 4A shown in FIG. 1 is 
that the instruction decoder 4 is provided with the selector 
at the internal input stage. 

Since there is only one instruction decoder 4, the 
control performed by the control circuit 6F is simpler than 
that performed by the control circuit 6 shown in FIG. 1. 
When IA2 is held in the instruction decoder 4, the control 
circuit 6F adds one to the PAN and induces the instruction 
register 3A to hold IA1. When IB2 is held in the 
instruction decoder 4, the control circuit 6F adds one to 
the PBN and induces the instruction register 3B to hold IB1. 

All other aspects of this processor are identical to 
those of the processor shown in FIG. 1. 
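
The alternation performed by the selector of the single 
instruction decoder 4, together with the update of the 
corresponding counter and instruction register, can be 
sketched as below. The sketch is a simplified software model, 
not the circuit: the function name decode_order, the list 
representation of the two program memories and the variables 
pa and pb (standing in for the counter values) are 
assumptions made for illustration. 

```python
def decode_order(bank_a, bank_b, steps):
    """Return the words in the order the single decoder receives them."""
    pa = pb = 0
    # Both instruction registers (3A and 3B) are loaded before decoding starts.
    reg_a, reg_b = bank_a[pa], bank_b[pb]
    decoded = []
    select_a = True  # the selector starts with register 3A
    for _ in range(steps):
        if select_a:
            decoded.append(reg_a)
            # After the word from 3A is taken, advance the A-side counter
            # and reload 3A with the next word of memory A.
            pa += 1
            if pa < len(bank_a):
                reg_a = bank_a[pa]
        else:
            decoded.append(reg_b)
            pb += 1
            if pb < len(bank_b):
                reg_b = bank_b[pb]
        select_a = not select_a
    return decoded
```

Because each register is reloaded while the decoder is busy 
with the word from the other register, the decoder in this 
model receives one word per step in program order without 
waiting for a fetch. 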

The normal pipeline processing that is executed after 
the processor is reset and until the processor enters the 
stationary state is explained in reference to FIG. 10. The 
initializing processing, not shown in the figure, that is 
executed immediately after a reset is identical to that 
performed in the first embodiment described earlier. 

(t1) PA = n and PB = n. The instruction decoder 4 and 
the execution circuit 5C are in the wait state. 

(t2) IA1(n) and IB1(n) are held in the instruction 
registers 3A and 3B respectively. PAN = n+1 and PBN = n+1. 

(t3) IA2(n) is held in the instruction decoder 4 and 
is decoded. IB2(n) enters the wait state. PA = n+1. The 
execution circuit 5C is in the wait state. 

(t4) IA1(n+1) is held in the instruction register 3A. 
PAN = n+2. 

(t5) DA(n) is held by the execution circuit 5C and 
executed. IB2(n) is held in the instruction decoder 4 and 
decoded. PB = n+1. 

(t6) IB1(n+1) is held in the instruction register 3B. 
PBN = n+2. 

(t7) DB(n) is executed by the execution circuit 5C. 
The memory access that corresponds to DA(n) is executed by 
the control circuit 6F. IA2(n+1) is held in the 
instruction decoder 4 and decoded. PA = n+2. 

(t8) IA1(n+2) is held in the instruction register 3A. 
PAN = n+3. 

(t9) DA(n+1) is executed by the execution circuit 5C. 
The memory write that corresponds to DB(n) and the write to 
register that corresponds to DA(n) are executed by the 
control circuit 6F. IB2(n+1) is held in the instruction 
decoder 4 and decoded. PB = n+2. 

The normal pipeline processing with 5 stages is 
executed in the manner described above and the processor 
enters the stationary state. Since the pipeline processing 
for an unconditional branch instruction and a conditional 
branch instruction executed after the processor enters the 
stationary state can be easily understood from the earlier 
explanation, its explanation is omitted here. With this 
processor too, the advantage that has been described already, 
that an unconditional branch instruction and a conditional 
branch instruction can be executed without interruption in 
one cycle, is achieved. 
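
The time points t1 to t9 listed above follow a regular 
pattern, which the short Python sketch below reproduces. The 
formulas are read off from those time points and are an 
extrapolation for illustration only; the function name 
stage_times, the index k and the stage labels IF and ID are 
assumptions (the text itself names only the EX, MA and WB 
stages). 

```python
def stage_times(k):
    """Time points at which the k-th word after reset passes each stage.

    k = 0 is the word at address An, k = 1 the word at Bn,
    k = 2 the word at A(n+1), and so on in program order.
    """
    return {
        "IF": max(2, 2 * k),  # both registers are loaded together at t2
        "ID": 3 + 2 * k,      # held in the instruction decoder 4
        "EX": 5 + 2 * k,      # executed by the execution circuit 5C
        "MA": 7 + 2 * k,      # memory access by the control circuit 6F
        "WB": 9 + 2 * k,      # write to register
    }
```

For k = 0 the sketch gives decoding at t3, execution at t5, 
memory access at t7 and the register write at t9, matching 
the entries for IA2(n) and DA(n) above; each following word 
is shifted by two time points. 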

Fifth embodiment 

FIG. 11 shows the processor in the fifth embodiment 
according to the present invention. 

This processor is a simplified version of the 
structure shown in FIG. 5, having incorporated a similar 
simplification to that shown in FIG. 9. Namely, in order 
to directly supply the immediate data at the second word of 
an immediate data transfer instruction to the execution 
circuit 5D from the instruction register 3A or 3B, the 
output terminals of the instruction registers 3A and 3B are 
directly connected to the input terminals of the execution 
circuit 5D via the bypasses 7A and 7B respectively. 

FIG. 12 shows the pipeline processing for an immediate 
data transfer instruction that is executed after the 
processor enters the stationary state. The conditions for 
the immediate data transfer instruction are the same as 
those in the case in FIG. 6. 

(t1) DB(n-2) is held by the execution circuit 5D and 
executed. PA = n. IA2(n-1) is held in the instruction 
decoder 4 and decoded. 

(t2) IA1(n) is held in the instruction register 3A. 
PAN = n+1. 

(t3) DA(n-1) is held by the execution circuit 5D and 
executed. IB2(n-1) is held in the instruction decoder 4 
and decoded. PB = n. 

(t4) IB1(n) is held in the instruction register 3B. 
PBN = n+1. 

(t5) DB(n-1) is held by the execution circuit 5D and 
executed. The immediate data transfer instruction IA2(n) is 
held in the instruction decoder 4 and is decoded. PA = n+1. 

(t6) IA1(n+1) is held in the instruction register 3A 
so that PAN = n+2. The control circuit 6G decides that 
IA2(n) is an immediate data transfer instruction and, 
therefore, that IB2(n) is immediate data. The control 
circuit 6G sends IB2(n) as immediate data to the execution 
circuit 5D via the bypass 7B. 

(t7) The decoding result DA(n) of the immediate data 
transfer instruction is executed without meaning by the 
execution circuit 5D. The memory access that corresponds to 
DA(n) is executed without meaning by the control circuit 6G. 
IA2(n+1) is held in the instruction decoder 4 and is 
decoded. PB = n+1. 

(t8) IB1(n+1) is held in the instruction register 3B. 
PBN = n+2. 

(t9) DA(n+1) is executed by the execution circuit 5D. 
The memory access that corresponds to DA(n) is executed 
without meaning by the control circuit 6G. IB2(n+1) is 
held in the instruction decoder 4 and decoded. PA = n+2. 

The data transfer of double length immediate data to 
a register is executed in the WB stage corresponding to 
DA(n). 

As has been explained so far, since the double length 
immediate data transfer instruction that includes immediate 
data is executed without interruption in one cycle, a delay 
in processing is prevented. 

Sixth embodiment 

FIG. 13 shows the processor in the sixth embodiment 
according to the present invention. 

This processor is a simplified version of the 
structure shown in FIG. 7, having incorporated a similar 
simplification to that shown in FIG. 9. Namely, in order 
to execute a double length instruction faster, a double 
length instruction decoder 4N is added to the structure 
shown in FIG. 9 in addition to the instruction decoder 4, 
the output terminals of the instruction registers 3A and 
3B are connected to the input terminals of the double length 
instruction decoder 4N via the wires 8A and 8B respectively, 
and the output terminal of the double length instruction 
decoder 4N is connected to the input terminal of the 
execution circuit 5E. 

Since the operation performed by the control circuit 
6H of this processor can be easily understood from the 
earlier explanation, its explanation is omitted here. With 
this processor too, the advantage described earlier, that a 
double length instruction can be executed without 
interruption in one cycle, is achieved. 

CLAIMS 

1. A processor comprising, for each i that is 1 to n: 

an i-th program counter; 

i-th memory means for being addressed with the output 
from said i-th program counter; and 

an i-th instruction register for holding the outputs 
from said i-th memory means; 

said processor further comprising: 

an instruction decoder for selecting one of the 
outputs from said 1-st to n-th instruction registers and for 
decoding the selected output; 

an execution circuit for executing processing based 
upon the output from said instruction decoder; and 

a control circuit for inducing said instruction 
decoder to select the outputs from said 1-st to n-th 
instruction registers sequentially, for inducing said i-th 
program counter to update after the output of said i-th 
instruction register is selected by said instruction decoder, 
and for inducing said i-th instruction register to hold the 
outputs of said i-th memory means after said update; 

wherein a program is stored in said 1-st to n-th 
memory means in units of one word in the order of said 1-st 
memory means to said n-th. 

2. A processor according to claim 1 wherein: 

the outputs of said 1-st to n-th instruction registers 
are supplied to input terminals of said execution 
circuit via corresponding 1-st to n-th bypasses; and 

said control circuit decides whether or not the instruction 
decoded by the i-th instruction decoder is an immediate 
data transfer instruction based 
upon the output of that decoder and, if it is 
determined to be an immediate data transfer 
instruction, induces said execution circuit to fetch 
the immediate data through the i-th bypass in order to 
execute the immediate data transfer instruction at 

3. A processor according to claim 7, further 
comprising: 

a multiple length instruction decoder for 
instructions of length N, where 2 ≤ N ≤ n, for decoding N 
successive words in the outputs of said 1-st to n-th 
instruction registers as a multiple length instruction 
and for supplying the decoding result to said execution 
circuit; 

wherein said control circuit, when the output of 
one of the 1-st to n-th instruction decoders indicates 
a multiple length instruction of length N, induces said 
multiple length instruction decoder to decode said 
multiple length instruction of length N, and induces 
said one of the instruction decoders to decode 
the instruction following said multiple length 
instruction of length N. 

4. A processor substantially as herein described with 
reference to and as shown in Figures 9 and 10, 11 and 
12 or 13 of the accompanying drawings. 